ABSTRACT


The  estimation  of  mixing  proportions  in  the  mixture  model  is 
discussed  with  emphasis  on  the  mixture  of  two  normal 
components  with  all  five  parameters  unknown.  Simulations  are 
presented  which  compare  minimum  distance (MD)  and  maximum 
likelihood (ML)  estimation  of  the  parameters  of  this 
mixture-of-normals  model.  Some  practical  issues  of 
implementation  of  these  results  are  also  discussed. 
Simulation  results  indicate  that  ML  techniques  are  superior 
to  MD  when  component  distributions  actually  are  normal, 
while  MD  techniques  provide  better  estimates  than  ML  under 
symmetric  departures  from  component  normality.  Results  are 
presented  which  establish  strong  consistency  and  asymptotic 
normality  of  the  MD  estimator  under  conditions  which  include 
the  mixture-of-normals  model.  Asymptotic  variances  and 
relative  efficiencies  are  obtained  for  further  comparison  of 
the  MDE  and  MLE. 



MINIMUM  DISTANCE  ESTIMATION  OF  MIXTURE  MODEL  PARAMETERS  - 
ASYMPTOTIC  RESULTS  AND  SIMULATION  COMPARISONS 
WITH  MAXIMUM  LIKELIHOOD 


Wayne A. Woodward, William C. Parr,
William R. Schucany, and Henry L. Gray

1.  Introduction 

An important problem in aerospace remote sensing is the
estimation of the mixing proportions $p_1, p_2, \ldots, p_m$ in the
mixture density

$$f(x) = p_1 f_1(x) + p_2 f_2(x) + \cdots + p_m f_m(x)$$

where m is the number of components (crops) in the mixture
and, for component i, $f_i(x)$ is a density. The variable of
interest, X, is some measurement such as the reflected
energy in four bands of the light spectrum as measured by
the LANDSAT satellite, certain linear combinations of these
readings, or other derived "feature" variables.

Generally, parameter estimation in mixture model
applications has been accomplished by assuming that the
component distributions are normal and using maximum
likelihood (ML) techniques. In a recent report, Woodward et
al. (1982) have examined the use of minimum distance (MD)
estimation based on the Cramér-von Mises distance as an
alternative to maximum likelihood. Both ML and MD estimation
schemes in that paper were based upon the mixture of two
univariate normal distributions whose density function is
given by

$$f(x) = \frac{p}{\sqrt{2\pi}\,\sigma_1} \exp\left\{-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right\} + \frac{1-p}{\sqrt{2\pi}\,\sigma_2} \exp\left\{-\frac{(x-\mu_2)^2}{2\sigma_2^2}\right\}$$

where all 5 parameters $\mu_1$, $\sigma_1^2$, $\mu_2$, $\sigma_2^2$, and $p$ are unknown. It
was  also  assumed  that  no  training  data  are  available,  i.e., 
the  only  observations  are  from  the  mixture  distribution.  In 
this  setting,  motivated  by  the  crop  example,  p  is  the 
parameter  of  paramount  importance  while  location  and  scale 
of the components are nuisance parameters. Woodward et al.
(1982) compare ML and MD estimation techniques on simulated
mixtures of normal, t(4), and chi-square(9) densities with
varying amounts of separation. The results indicate that the
MDE is more robust than the MLE to symmetric departures from
component  normality,  while  neither  technique  provides 
satisfactory  results  when  component  distributions  are 
skewed. 

In  this  report,  we  present  further  simulation  results 
comparing  ML  and  MD  estimation  of  the  mixing  proportion 
based  on  a  mixture-of-normals  model,  when  in  fact  the 
component  distributions  are  not  normal,  yet  represent 
symmetric departures from normality. Unless otherwise
indicated, reference to the MDE in this report will involve
the use of Cramér-von Mises distance. We also present
asymptotic results which establish the strong consistency
and asymptotic normality of MD estimators of the parameters
in the mixture-of-normals model, and finally provide
asymptotic relative efficiencies for comparing the MLE and
MDE in this setting.
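
The Cramér-von Mises distance between the empirical distribution
function and a candidate mixture-of-normals cdf can be sketched as
follows. This is a minimal illustration, not the implementation used
in the report: the function names are hypothetical, the normal cdf is
computed from the error function, and the distance uses the standard
computing formula in terms of the order statistics. The MD estimate
would be the parameter vector minimizing this quantity; the
minimization step itself is omitted here.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, p, mu1, sigma1, mu2, sigma2):
    """CDF of the two-component normal mixture; p is the weight of component 1."""
    return p * normal_cdf(x, mu1, sigma1) + (1.0 - p) * normal_cdf(x, mu2, sigma2)


def cvm_distance(sample, cdf):
    """Cramer-von Mises distance: integral of (G_n - F)^2 dF for continuous F.

    Uses the computing formula
        n * W^2 = 1/(12 n) + sum_i (F(x_(i)) - (2i - 1)/(2n))^2,
    where x_(1) <= ... <= x_(n) are the order statistics.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (cdf(x) - (2.0 * i - 1.0) / (2.0 * n)) ** 2
    return total / n


if __name__ == "__main__":
    # Made-up data and parameter values, for illustration only.
    data = [-1.2, -0.4, 0.1, 0.3, 1.9, 2.4, 3.1, 3.3]
    theta = dict(p=0.5, mu1=0.0, sigma1=1.0, mu2=3.0, sigma2=1.0)
    print(cvm_distance(data, lambda x: mixture_cdf(x, **theta)))
```

Passing the cdf as a callable keeps the distance independent of the
projection model, so the same function would apply to other component
families.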


2.  Simulation  Results 

In  this  section  we  report  the  results  of  a  Monte  Carlo 
study  designed  to  compare  the  ML  and  MD  estimators  based 
upon  a  mixture-of-normals  when  the  simulated  component 
distributions  are  normal  and  when  they  are  non-normal.  These 
comparisons  are  made  under  varying  degrees  of  separation 
between  the  two  component  distributions.  All  computations 
were  performed  on  the  CDC  6600  at  Southern  Methodist 
University. 

In these simulations, the mixing proportion, p, takes
on the values .25, .50, and .75. For a given mixture, the
component distributions differ from each other only in
location and scale. In particular, $f_1(x)$ is taken to be the
density associated with a random variable $X = aY$ while $f_2(x)$
is the density for $X = Y + b$ where $a > 0$, $b > 0$. Thus, a is the
ratio of scale parameters for the densities $f_1$ and $f_2$, and
similarly, b is the difference in location parameters. The
random variable Y in our simulations is either normal,
Student's t with 2 or 4 degrees of freedom, or double
exponential. In our simulations we use $a = 1$ and $a = \sqrt{2}$ while b
is selected to provide the desired separation between the
component distributions. The number of modes of the mixture
density depends to a large extent on this separation between
the two component distributions. Although, for sufficient
separation, the mixture model has a characteristic bimodal
shape, the density may be unimodal when there is only
moderate separation between the components, and in this
case, parameter estimation is more difficult than it is in
the bimodal cases. For purposes of quantifying this
separation between the components, a measure of "overlap"
between two distributions was defined by Woodward et
al. (1982).

For each set of parameter configurations, 500 samples
of size n=100 were generated from the corresponding mixture
distribution. Simulations were based on the IMSL
multiplicative congruential uniform random number generator
GGUBS. Normal component observations were generated using
IMSL subroutine GGNPM which uses the polar method, while
t(n) observations were based on the ratio of independent
chi-square and normal deviates, each obtained using IMSL
routines. Double exponential components were based on ln(U)
where U is uniform(0,1), and randomly assigning either a
positive or negative sign. In all cases, observations from
the basic component distribution under investigation were
simulated and then assigned to either component 1 or
component 2 depending upon whether an independent
uniform(0,1) was less than or greater than p. The
observations were then scaled and shifted (with a and b) to
provide observations from the appropriate component.
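
The sampling scheme just described can be sketched as follows. This
is an illustrative reconstruction only: it uses Python's standard
`random` module in place of the IMSL routines, the function names and
the string labels for the component types are hypothetical, and the
value of b in the usage lines is arbitrary rather than one of the
separations used in the study.

```python
import math
import random


def draw_base(kind, rng):
    """Draw one observation Y from the basic component distribution."""
    if kind == "normal":
        return rng.gauss(0.0, 1.0)
    if kind in ("t2", "t4"):
        df = 2 if kind == "t2" else 4
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        return z / math.sqrt(chi2 / df)
    if kind == "double_exponential":
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
        magnitude = -math.log(u)
        return magnitude if rng.random() < 0.5 else -magnitude
    raise ValueError("unknown component kind: %r" % kind)


def simulate_mixture(n, p, a, b, kind, rng):
    """Component 1 is a*Y (probability p); component 2 is Y + b (probability 1 - p)."""
    sample = []
    for _ in range(n):
        y = draw_base(kind, rng)
        if rng.random() < p:
            sample.append(a * y)
        else:
            sample.append(y + b)
    return sample


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed so the run is repeatable
    x = simulate_mixture(100, 0.25, math.sqrt(2.0), 3.0, "t4", rng)
    print(len(x), min(x), max(x))
```

Drawing the base variate first and then scaling or shifting mirrors
the order of operations in the text, so every component type shares
the same assignment step.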

For  each  sample  simulated,  both  the  MDE  and  MLE  were 
obtained. The iterative procedures discussed by Woodward et
al. (1982) were implemented in such a way that acceptable
parameter  estimates  are  obtained  for  each  sample.  For 
example,  if  the  iterative  procedure  fails  to  converge  in  the 
specified  number  of  iterations,  the  last  value  obtained  in 
the  iteration  is  taken  to  be  the  estimate  if  this  value  is 
"reasonable"  according  to  preset  criteria.  In  general,  if 
any  of  the  following  conditions  existed  at  any  step  in  the 
iteration, 


iteration is terminated and the corresponding estimate is
taken to be the starting value. This did not occur in any of
the 500 repetitions for most configurations, but did occur
a maximum of 7 times out of 500 for MD estimates of the
parameters of a mixture of t(2) components. The extreme
observations which occasionally appear in samples from t(2)
mixtures also forced a modification in the first step of
the MLE iteration to avoid a division by zero. Although both
estimation procedures provide estimates of all 5 of the
parameters, only the results for estimation of p will be
tabulated since the mixing proportion is the parameter of
primary interest, as previously mentioned. In addition, when
dealing with the non-normal mixtures, the remaining
parameter estimates often do not have a meaningful
interpretation.

In Table 1 we present summary results of the
simulations comparing the performance of the MLE and MDE for
mixtures of normal components while in Table 2 we display
the results for the non-normal components. The results for
normal and t(4) components were previously given in Woodward
et al. (1982). Estimates of the bias and MSE based upon the
simulations are given by:

$$\widehat{\mathrm{Bias}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat{p}_i - p)$$

and

$$\widehat{\mathrm{MSE}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat{p}_i - p)^2$$

where $n_s$ is the number of samples, and $\hat{p}_i$ denotes an
estimate of p for the ith sample. It should be noted that
$n\widehat{\mathrm{MSE}}$ is the quantity actually given in the tables since this
facilitates comparison with asymptotic variances in
Section 4. Since the MLE and MDE are both asymptotically
unbiased (this will be discussed for the MDE in the next
section), $n_s\widehat{\mathrm{MSE}}/\sigma^2$ is approximately $\chi^2(500)$. It is easy to
show, then, that the approximate standard error of a tabled
$n\widehat{\mathrm{MSE}}$ is $(.0632)(n\mathrm{MSE})$, since $\sqrt{2/500} = .0632$. In addition,
we also provide the ratio

$$\hat{E} = \frac{\widehat{\mathrm{MSE}}(\mathrm{MLE})}{\widehat{\mathrm{MSE}}(\mathrm{MDE})}$$

as an empirical relative efficiency measure.

Table 2. Simulation Results for Mixtures of Non-normal Components
Sample size = 100; number of replications = 500
(-- denotes an illegible entry)

Double Exponential Components

                        Overlap = .10                  Overlap = .03
                                          MDE                            MDE
p    a    Est.      Bias   nMSE      E Closer      Bias   nMSE      E Closer
.25  1    MDE       .054   2.96   2.13    .66      .030   .545   1.18    .50
          MLE       .091   6.31                    .026   .645
          Start     .065   1.40                    .078   1.04
.50  1    MDE       .007   1.03   4.04    .69     -.001   .286   1.29    .54
          MLE       .007   4.16                   -.001   .368
          Start    -.004   1.17                    .000   .414
.25  √2   MDE       .102   4.42   1.40    .60        --   .775   1.07    .48
          MLE       .034   6.17                      --   .832
          Start     .011   .926                    .050   .678
.50  √2   MDE       .032   1.50   2.71    .68        --   .259   1.44    .58
          MLE       .073   4.06                      --   .372
          Start    -.088   1.86                      --   .570
.75  √2   MDE      -.037   2.20   2.94    .73        --   .344    .94    .44
          MLE      -.067   6.47                      --   .323
          Start    -.151   3.31                   -.107   1.63

t(4) Components

                        Overlap = .10                  Overlap = .03
                                          MDE                            MDE
p    a    Est.      Bias   nMSE      E Closer      Bias   nMSE      E Closer
.25  1    MDE         --   6.18   1.19    .61        --   .466   1.89    .49
          MLE         --   7.35                      --   .883
          Start       --   1.59                      --   .998
.50  1    MDE       .004   1.82   3.07    .69      .000   .266   1.64    .53
          MLE       .015   5.59                      --   .436
          Start     .006   1.21                   -.001   .496
.25  √2   MDE       .098     --    .89    .53      .029   .605   1.61    .49
          MLE       .061   4.63                    .044   .976
          Start    -.010   .810                    .036   .654
.50  √2   MDE         --     --     --     --      .001   .300   1.85    .55
          MLE         --     --                    .010   .554
          Start       --     --                   -.046   .778
.75  √2   MDE      -.058   3.68   2.13    .65     -.016   .361   1.57    .50
          MLE      -.076   7.84                   -.012   .567
          Start    -.137   3.07                   -.108   1.75
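
The Monte Carlo summaries defined in this section (bias, MSE, the
approximate standard error of a tabled nMSE, and the empirical
relative efficiency) can be sketched as follows. The function names
are illustrative assumptions and are not taken from the original
programs.

```python
import math


def bias_and_mse(estimates, p):
    """Monte Carlo bias and MSE of a list of estimates of the true proportion p."""
    n_s = len(estimates)
    bias = sum(e - p for e in estimates) / n_s
    mse = sum((e - p) ** 2 for e in estimates) / n_s
    return bias, mse


def nmse_standard_error(nmse, n_s):
    """Approximate standard error of a tabled nMSE from the chi-square(n_s) argument."""
    return math.sqrt(2.0 / n_s) * nmse


def empirical_efficiency(mse_mle, mse_mde):
    """Ratio MSE(MLE) / MSE(MDE); values above 1 favor the MDE."""
    return mse_mle / mse_mde


if __name__ == "__main__":
    # Tabled nMSE values for double exponential components, p = .25, a = 1,
    # overlap = .10 (MLE 6.31, MDE 2.96).
    print(empirical_efficiency(6.31, 2.96))
    print(nmse_standard_error(2.96, 500))
```

With 500 replications the factor `math.sqrt(2.0 / 500)` is about
.0632, which is the multiplier quoted in the text.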

In order to take advantage of the paired nature of our
ML and MD estimates, we counted the proportion of samples
for which $\hat{p}_D$ is closer to p than is $\hat{p}_L$, where $\hat{p}_D$ and $\hat{p}_L$
denote the MD and ML estimates respectively. We present this
proportion in the tables under the heading "MDE Closer".
This provides an estimate of $P\{|\hat{p}_D - p| < |\hat{p}_L - p|\}$. The
standard error of the binomial proportions shown in the
tables is no greater than $\sqrt{.25/500} = .022$.
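
The paired comparison can be sketched as follows. This is a minimal
illustration with hypothetical function names; the bound on the
standard error is the usual binomial maximum, attained at a
proportion of one half.

```python
import math


def mde_closer_proportion(md_estimates, ml_estimates, p):
    """Fraction of paired samples in which the MD estimate is strictly closer to p."""
    if len(md_estimates) != len(ml_estimates):
        raise ValueError("estimates must be paired sample by sample")
    closer = sum(1 for d, l in zip(md_estimates, ml_estimates)
                 if abs(d - p) < abs(l - p))
    return closer / len(md_estimates)


def max_binomial_se(n_samples):
    """Largest possible standard error of a proportion based on n_samples trials."""
    return math.sqrt(0.25 / n_samples)


if __name__ == "__main__":
    print(max_binomial_se(500))
```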

Inspection of the results shows that the estimated Bias
and MSE associated with the MLE were generally smaller than
those for the MDE when the components were actually normally
distributed. This relationship between the estimators held
for both overlaps. The MLE and MDE were quite similar at
p=.5, while for p=.25 and p=.75 the superiority of the MLE
was more pronounced.

For the mixtures of non-normal components, the
relationship between MDE and MLE is reversed in that the MDE
generally has the smaller estimated Bias and MSE, especially
for t(2) mixtures. The superiority of the MDE is due in part
to the heavy tails in these components. The MLE often
interpreted an extreme observation as being the only sample
value from one of the populations with all remaining
observations belonging to the other. Due to the well known
singularities associated with a zero variance estimate for a
component distribution (Day, 1969), we were concerned that
the observed behavior of the MLE was due to the fact that
the variances were not constrained away from zero.
However, simulation results in which equal variances were
assumed (which removes the singularity) and also those that
used a penalized MLE suggested by Redner (1980) were very
similar to those quoted here.

A surprising result which was previously noted by
Woodward et al. (1982) is that the starting values obtained
using the procedure outlined in Section 3 produced
estimators that were competitive with both the MLE and MDE.
For both the normal and non-normal mixtures, the MSEs
associated with the starting values were generally lower
than those for the MDE and MLE when overlap = .10. However,
when overlap = .03, the starting value estimates were
generally poorer than those for the MDE and MLE, except for
the t(2) mixtures for which the MLEs were the poorest.


3. Asymptotic Distribution Theory for Minimum
Cramér-von Mises Distance Estimation

Asymptotic theory for minimum Cramér-von Mises distance
estimators for location parameters can be found in Parr and
Schucany (1980), and for the general one parameter case in
Parr and de Wet (1981). Bolthausen (1977) gives results for
the multiparameter case, but with conditions which are so
strict as to rule out scale parameters for unbounded random
variables (see his condition III). The purpose of the
results in this section is to extend this previous work to
cover multiparameter situations including, among others, the
problem of normal mixtures.

Assume that at stage n we observe real-valued $X_1$,
$X_2, \ldots, X_n$ iid from a distribution with cdf G and let $G_n$
denote the usual empirical distribution function. Let
$\mathcal{F} = \{F_\theta : \theta \in \Theta \subset R^k\}$, the projection model, be a family of
continuous distribution functions and assume that $G \in \mathcal{F}$,
i.e., $G = F_{\theta_0}$ for some $\theta_0 \in \Theta$. Further, assume that there
exists an open set $A \subset \Theta$ with $\theta_0 \in A$. Also consider the
following continuity (C) and differentiability (D) conditions:

(C) If $\theta_n \in \Theta$, n = 1, 2, ..., then

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} (F_{\theta_n}(x) - F_{\theta_0}(x))^2 \, dF_{\theta_0}(x) = 0$$

implies $\lim_{n \to \infty} \theta_n = \theta_0$.

(D) There exists a function $\eta: (0,1) \to R^k$ such that

$$\sup_x \left| F_\theta(x) - F_{\theta_0}(x) - (\theta - \theta_0)'\eta(F_{\theta_0}(x)) \right| = o(\|\theta - \theta_0\|)$$

as $\|\theta - \theta_0\| \to 0$, where $\|\cdot\|$ is the usual Euclidean
norm on $R^k$, and $\int_0^1 \eta_i^2(u)\,du < \infty$ for $i = 1, \ldots, k$, where
$\eta'(u) = (\eta_1(u), \eta_2(u), \ldots, \eta_k(u))$.


Notes:

1) Condition C is satisfied if, for instance, $F_\theta(x)$ is
continuous in $\theta$ at $\theta_0$, pointwise in x (use dominated
convergence). It can be interpreted as requiring that $\theta$
"continuously parametrize" the projection model.

2) If condition C is not satisfied, then this implies
$\sup_{-\infty<x<\infty} |F_\theta(x) - F_{\theta_0}(x)|$ can be arbitrarily small without
having $\theta$ approach $\theta_0$. In such a case, the search for any
consistent estimator seems hopeless. In particular, in such a
situation, any consistent estimating functional must be
discontinuous with respect to the sup-norm, and hence highly
nonrobust.

3) Condition D is weaker than (implied by) quadratic
mean differentiability of $f_\theta^{1/2}$, the canonical regularity
condition for asymptotic normality of the maximum likelihood
estimator (see LeCam (1970) and Pollard (1980)).

4) Usually,

$$\eta_i(u) = \left. \frac{\partial F_\theta(x)}{\partial \theta_i} \right|_{x = F_{\theta_0}^{-1}(u)}$$

and condition D simply
states the uniform validity of the first order Taylor
approximation to $F_\theta(x)$. If k=1 and $\theta$ is a location
parameter, a sufficient condition to imply D is that $F_\theta$
possess a uniformly continuous density.

Before continuing, define the k×k symmetric matrices A
and B by

$$A = (a_{ij}), \quad B = (b_{ij})$$

with

$$a_{ij} = \int_0^1 \eta_i(u)\,\eta_j(u)\,du$$

and

$$b_{ij} = \int_0^1\!\!\int_0^1 \{\min(u,v) - uv\}\,\eta_i(u)\,\eta_j(v)\,du\,dv$$

and assume A to be of full rank. We can now state and
outline the proof of the following strong consistency and
asymptotic normality results.


Theorem 1: Let $\hat\theta_n$ be a minimum distance estimator of $\theta$ for
all n = 1, 2, .... Then, if condition C holds, $\hat\theta_n \to \theta_0$ with
probability one.


Proof: Clearly, $\int (G_n - F_{\theta_0})^2 \, dF_{\theta_0} \to 0$ with probability one,
and hence also $\inf_{\theta \in \Theta} \int (G_n - F_\theta)^2 \, dF_\theta \to 0$ with probability one.
Now,

$$\sup_\theta \left| \int (G_n - F_\theta)^2 \, dF_\theta - \int (F_{\theta_0} - F_\theta)^2 \, dF_\theta \right| \le 4 \sup_{-\infty<t<\infty} |G_n(t) - F_{\theta_0}(t)| \to 0$$

with probability one. Hence,

$$\int (F_{\hat\theta_n} - F_{\theta_0})^2 \, dF_{\hat\theta_n} \to 0$$

with probability one, and strong consistency of $\hat\theta_n$ follows
from the assumption.

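Theorem 2: Assume conditions C and D and that A is of full
rank. Then, if $f_\theta(x)$ is continuous in $\theta$ at $\theta_0$ for every x,

$$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, A^{-1}BA^{-1}).$$


Proof. (Sketched)
Set

$$K_n(\xi) = n \int (G_n - F_{\theta_0 + \xi/\sqrt{n}})^2 \, dF_{\theta_0 + \xi/\sqrt{n}} \quad \text{for } \xi \in R^k.$$

Then we have

$$
\begin{aligned}
K_n(\xi) &= n \int \left( G_n - F_{\theta_0} - (F_{\theta_0 + \xi/\sqrt{n}} - F_{\theta_0}) \right)^2 dF_{\theta_0} \\
&\quad + n \int \left( G_n - F_{\theta_0} - (F_{\theta_0 + \xi/\sqrt{n}} - F_{\theta_0}) \right)^2 d\left[ F_{\theta_0 + \xi/\sqrt{n}} - F_{\theta_0} \right] \\
&= o_p(1) + \int_0^1 \left( U_n(t) - \xi'\eta(t) - R_n(t) \right)^2 dt,
\end{aligned}
$$

uniformly in $\xi$ for $\|\xi\| \le C$, for any $C < \infty$, where
$\sup_{0<t<1} |R_n(t)| \to 0$ with probability one, also uniformly in $\xi$
for $\|\xi\| \le C$. Here, $U_n(t) = \sqrt{n}\,(G_n(F_{\theta_0}^{-1}(t)) - t)$, $0 < t < 1$.
By an extension of the argument of Pyke (1970, p. 29-30) to
the present context, we obtain that the limiting law of the
random variable minimizing $K_n(\xi)$ over $\xi$ is also that of the
value minimizing

$$\int_0^1 (B(t) - \xi'\eta(t))^2 \, dt,$$

where B is a Brownian bridge. The result then follows
immediately.


It can be shown that the mixture of normals model satisfies the
conditions of both Theorem 1 and Theorem 2.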
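
The last step of the proof sketch of Theorem 2 can be made explicit.
The following short derivation is an added illustration rather than
part of the original argument; it assumes that expectation and
integration may be interchanged, which is reasonable since each
$\eta_i$ is square integrable, and it uses the covariance function of
the Brownian bridge, $E[B(u)B(v)] = \min(u,v) - uv$. Note that B
denotes the Brownian bridge inside the integrals and the matrix
$(b_{ij})$ elsewhere.

$$
\begin{aligned}
\frac{\partial}{\partial \xi} \int_0^1 (B(t) - \xi'\eta(t))^2 \, dt = 0
&\;\Longrightarrow\; \xi^* = A^{-1} \int_0^1 \eta(t)\,B(t)\,dt, \\
\operatorname{Cov}\left( \int_0^1 \eta(t)\,B(t)\,dt \right)_{ij}
&= \int_0^1\!\!\int_0^1 \eta_i(u)\,\eta_j(v)\,E[B(u)B(v)]\,du\,dv \\
&= \int_0^1\!\!\int_0^1 \{\min(u,v) - uv\}\,\eta_i(u)\,\eta_j(v)\,du\,dv = b_{ij}.
\end{aligned}
$$

Since $\xi^*$ is a linear functional of a mean-zero Gaussian process, it
is normal with mean zero, and because A is symmetric its covariance
matrix is $A^{-1}BA^{-1}$.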


4.  Asymptotic  Relative  Efficiencies 

Theorem 2 of the previous section indicates that for
the mixture-of-normals model, we have

$$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, A^{-1}BA^{-1}),$$

where $\theta_0 = (\mu_1, \sigma_1^2, \mu_2, \sigma_2^2, p)'$ and $\hat\theta_n$ is the vector of
corresponding MD estimators using Cramér-von Mises distance.
Likewise, it is well known that

$$\sqrt{n}\,(\hat\theta_L - \theta_0) \xrightarrow{d} N(0, I^{-1}(\theta_0)),$$

where $\hat\theta_L$ is the MLE of $\theta_0$ and $I(\theta_0)$ is Fisher's information
matrix. We will employ the usual terminology and refer to
$A^{-1}BA^{-1}$ and $I^{-1}(\theta_0)$ as asymptotic variance-covariance matrices
and to their diagonal elements as asymptotic variances of
the corresponding estimators. In this section we will
present computed asymptotic variances for the MDE of p,
which is denoted by $\hat{p}_D$, and compare these with the asymptotic
variances associated with the MLE, denoted by $\hat{p}_L$.

The components of the matrix A were evaluated using the
expression

$$\int_{-\infty}^{\infty} C_i(x)\,C_j(x)\,f_\theta(x)\,dx,$$

where $F_\theta(x)$ and $f_\theta(x)$ denote the distribution function and
density function respectively for the mixture, $\theta_i$ is the ith
component of $\theta$, and

$$C_i(x) = \frac{\partial F_\theta(x)}{\partial \theta_i}.$$

This integral was evaluated using IMSL subroutine DCADRE
which employs Romberg extrapolation to perform numerical
integration of an integral over a finite interval. In our
implementation, we used DCADRE to evaluate the integral

$$\int_L^U C_i(x)\,C_j(x)\,f_\theta(x)\,dx,$$

where $L = \min(-10\sigma_1 + \mu_1, -10\sigma_2 + \mu_2)$ and $U = \max(10\sigma_1 + \mu_1, 10\sigma_2 + \mu_2)$
with maximum allowable absolute error specified as
$1.0 \times 10^{-15}$ and relative error of $1.0 \times 10^{-12}$. The double
integral

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \{F_\theta(\min(x,y)) - F_\theta(x)F_\theta(y)\}\,C_i(x)\,C_j(y)\,f_\theta(x)\,f_\theta(y)\,dx\,dy$$

involved in calculating the elements of the matrix B is
approximated by using IMSL subroutine DBLIN to perform a
Romberg integration of the integral

$$\int_L^U\!\int_L^U \{F_\theta(\min(x,y)) - F_\theta(x)F_\theta(y)\}\,C_i(x)\,C_j(y)\,f_\theta(x)\,f_\theta(y)\,dx\,dy$$

with maximum allowable absolute error specified as
$1.0 \times 10^{-9}$.
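
As an illustration of the single integral above, the following sketch
computes the diagonal element of A corresponding to the mixing
proportion p. It is not the original implementation: a composite
Simpson rule is swapped in for the IMSL routine DCADRE, the function
names are hypothetical, and the number of subintervals is an
arbitrary choice. It relies on the fact that, for the
mixture-of-normals cdf, the derivative with respect to p is the
difference of the two component cdfs.

```python
import math


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def simpson(func, lower, upper, n_intervals=2000):
    """Composite Simpson rule on [lower, upper]; n_intervals must be even."""
    if n_intervals % 2:
        raise ValueError("n_intervals must be even")
    h = (upper - lower) / n_intervals
    total = func(lower) + func(upper)
    for k in range(1, n_intervals):
        total += (4.0 if k % 2 else 2.0) * func(lower + k * h)
    return total * h / 3.0


def a_pp(p, mu1, sigma1, mu2, sigma2):
    """Diagonal element of A for the mixing proportion p."""
    # Finite limits L and U as in the text: ten standard deviations either side.
    lower = min(mu1 - 10.0 * sigma1, mu2 - 10.0 * sigma2)
    upper = max(mu1 + 10.0 * sigma1, mu2 + 10.0 * sigma2)

    def integrand(x):
        # C_p(x) = dF/dp = Phi((x - mu1)/sigma1) - Phi((x - mu2)/sigma2)
        c_p = normal_cdf(x, mu1, sigma1) - normal_cdf(x, mu2, sigma2)
        density = (p * normal_pdf(x, mu1, sigma1)
                   + (1.0 - p) * normal_pdf(x, mu2, sigma2))
        return c_p * c_p * density

    return simpson(integrand, lower, upper)


if __name__ == "__main__":
    # Made-up parameter values, for illustration only.
    print(a_pp(0.5, 0.0, 1.0, 3.0, 1.0))
```

The remaining elements of A and the double integrals for B would
follow the same pattern with the other partial derivatives, at a
correspondingly higher computational cost.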

The calculation of the information matrix for the
mixture-of-normals model is discussed by Behboodian (1972).
We have followed Behboodian's procedure and used
Gauss-Hermite quadrature to approximate the integrals
involved. Using 48-point quadrature we obtain good agreement
with Behboodian's tabled results.

In Table 3 we display the asymptotic variances for $\hat{p}_D$
and $\hat{p}_L$ along with asymptotic relative efficiency (ARE)
calculated as

$$\mathrm{ARE} = \frac{\text{asymptotic variance of } \hat{p}_L}{\text{asymptotic variance of } \hat{p}_D}.$$

These values are calculated for each of the parameter
configurations employed in Table 1 for the normal mixtures.
As in Table 1, the asymptotic results indicate that the MDE
compares more favorably with the MLE when p=.5 while its
relative performance is not as good for p=.25 or p=.75.


Table 3 - Asymptotic Relative Efficiencies
(-- denotes an illegible entry)

                      Overlap = .10                 Overlap = .03
                 Asymptotic                    Asymptotic
p    a    Est.   Variance         ARE          Variance         ARE
.25  1    MDE    13.60 (7.80)*    .42 (.55)    --               .69 (.49)
          MLE    5.67 (4.26)                   --
.50  1    MDE    4.54 (3.86)      .65 (.83)    .398 (.420)      .89 (.91)
          MLE    2.95 (3.21)                   .355 (.382)
.25  √2   MDE    18.77 (5.30)     .32 (.42)    .511 (.956)      .65 (.51)
          MLE    5.96 (2.25)                   .330 (.489)
.50  √2   MDE    3.49 (2.79)      .68 (.86)    .395 (.441)      .89 (.94)
          MLE    2.39 (2.41)                   .353 (.416)
.75  √2   MDE    5.51 (8.36)      .58 (.58)    .420 (1.08)      .73 (.44)
          MLE    3.18 (4.87)                   .305 (.470)

*Associated Monte Carlo results from Table 1 are given in parentheses.


5.  Concluding  Remarks 


We  believe  that  the  results  of  this  paper  provide 
further  evidence  that  the  use  of  the  MDE  should  be 
considered  in  crop  proportion  estimation  procedures 
developed  by  NASA.  Our  results,  again,  and  more  conclusively 
than  before,  indicate  that  the  MDE  is  indeed  more  robust 
than  the  MLE  in  the  sense  that  it  is  less  sensitive  to 
symmetric  departures  from  the  underlying  assumption  of 
normality  of  component  distributions. 

Woodward et al. (1983) have investigated basing the MD
estimation  procedure  on  a  mixture  of  Weibull  components  in 
order  to  allow  for  possible  asymmetry  in  the  component 
distributions.  Their  results  indicate  that  this  approach 
provides  a  viable  alternative  to  the  normal-based  procedures 
discussed  here.  Research  is  also  proceeding  on  the  case  of 
multiple  (>2)  components  in  the  mixture. 

The results of Section 4 indicate that the MDE does not
perform as well as would be hoped when the data actually do
arise from a mixture-of-normals model. We are currently
examining the use of the Hellinger metric in this regard due
to the results of Beran (1977) concerning the full asymptotic
relative efficiency of minimum Hellinger distance
estimators.